Mining Low Dimensionality Data Streams of Continuous Attributes

Authors

  • Francisco J. Ferrer-Troyano
  • Jesús S. Aguilar-Ruiz
  • José Cristóbal Riquelme Santos

Abstract

This paper presents an incremental and scalable learning algorithm for mining numeric, low-dimensionality, high-cardinality, time-changing data streams. Within the supervised learning field, our approach, named SCALLOP, provides a set of decision rules whose size is very close to the number of concepts to be extracted. Experimental results on synthetic databases of varying complexity show good performance on data streams received at a rapid rate, whose label distribution may not be stationary over time.
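
To make "a set of decision rules learned incrementally from a numeric stream" concrete, here is a minimal sketch. It is not the SCALLOP algorithm, whose details the abstract does not give. It assumes, purely for illustration, that each rule is an axis-parallel box (one closed interval per attribute) with a class label, and that an example is absorbed by the nearest same-label rule when the required widening stays under a threshold. The names `IntervalRule`, `StreamingRuleLearner`, and `max_growth` are hypothetical, and the sketch does not handle concept drift or conflicts between overlapping rules of different classes.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class IntervalRule:
    """Hypothetical rule: one closed interval per attribute plus a class label."""
    lower: List[float]
    upper: List[float]
    label: str
    support: int = 1  # number of examples absorbed by this rule

    def growth(self, x: Sequence[float]) -> float:
        """Total widening (summed over attributes) needed to cover x; 0 if covered."""
        return sum(max(lo - v, 0.0, v - hi)
                   for lo, v, hi in zip(self.lower, x, self.upper))

    def extend(self, x: Sequence[float]) -> None:
        """Widen the intervals just enough to cover x and count the example."""
        self.lower = [min(lo, v) for lo, v in zip(self.lower, x)]
        self.upper = [max(hi, v) for hi, v in zip(self.upper, x)]
        self.support += 1


class StreamingRuleLearner:
    """Illustrative one-pass learner; an assumption, not the paper's method."""

    def __init__(self, max_growth: float) -> None:
        self.max_growth = max_growth  # assumed threshold on rule widening
        self.rules: List[IntervalRule] = []

    def learn_one(self, x: Sequence[float], label: str) -> None:
        # Same-label rules that can absorb x without growing too much.
        candidates = [r for r in self.rules
                      if r.label == label and r.growth(x) <= self.max_growth]
        if candidates:
            # Extend the closest rule (growth 0 means x is already covered).
            min(candidates, key=lambda r: r.growth(x)).extend(x)
        else:
            # No suitable rule: start a new one that covers only x.
            self.rules.append(IntervalRule(list(x), list(x), label))

    def predict_one(self, x: Sequence[float]) -> Optional[str]:
        """Label of the rule that needs the least widening to cover x."""
        if not self.rules:
            return None
        return min(self.rules, key=lambda r: r.growth(x)).label


if __name__ == "__main__":
    learner = StreamingRuleLearner(max_growth=1.0)
    stream = [((0.1, 0.2), "a"), ((0.3, 0.4), "a"),
              ((5.0, 5.0), "b"), ((5.2, 5.1), "b"), ((0.2, 0.3), "a")]
    for x, y in stream:
        learner.learn_one(x, y)
    for rule in learner.rules:
        print(rule)
```

In this toy stream the two well-separated classes end up with one rule each, which is the kind of compact rule set the abstract describes. A learner for time-changing streams would also need a way to shrink, split, or forget rules, which this sketch leaves out.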





Publication date: 2003